Abstract: In this paper, we present an asymptotic
analysis of the convergence of raptor codes under
joint decoding over the binary input additive white
Gaussian noise channel (BIAWGNC), and derive an
optimization method. We use information content evolution
under Gaussian approximation, and focus on a new decoding
scheme that proves to be more efficient: the joint
decoding of the two code components of the raptor code.
In our general model, the classical tandem decoding
scheme appears as a subcase, and thus the design
of LT codes is also possible.

Keywords: Raptor code, joint decoding, optimization 
of distribution 

I. Introduction 

Fountain codes were originally introduced [1] to transmit
efficiently over an erasure channel with unknown
erasure probability. They are of great interest for multicast
or peer-to-peer applications, and when no feedback channel
is available.

LT codes are the first class of efficient fountain codes,
introduced by Luby [2]. An LT code produces a potentially
limitless number of independent output symbols according
to an output degree distribution. LT codes have been proved
to be asymptotically capacity achieving on the Binary Erasure
Channel (BEC) [2], [3]. High performance is achieved by
designing good output degree distributions. In order to
obtain an arbitrarily small decoding failure probability, the
average degree of the output symbols has to grow at least
logarithmically with k, the number of input symbols. Thus,
performance is achieved at a decoding cost growing in
$O(k \log k)$. This complexity is too high to ensure linear
encoding and decoding time, which is a desired property
for practical codes.

Raptor codes are a class of fountain codes introduced
by Shokrollahi in [3] as an extension of LT codes. A
raptor code is the concatenation of an LT code and an
outer code, called precode. The precode is a very high
rate error correcting block code. Thus, the condition of
recovering each and every input symbol with arbitrarily
high probability can be relaxed: the LT code needs to
recover a large enough proportion of input symbols, and
the precode is in charge of recovering the fraction of
input symbols unrecovered by the LT code. This enables
the design of degree distributions of constant mean, i.e.
linear encoding and decoding time. In [4], the author
independently presented the idea of precoding to obtain
linear decoding time codes. Recently in [5], the results over
the BEC of [3] were extended to general binary memoryless
symmetric channels.

In all the previously proposed approaches, the LT code 
and the precode are decoded separately. In this paper, we 
consider another decoding scheme: the joint decoding of 
the two code components. The main idea behind joint 
decoding is that the precode can help the LT code to 
converge, by providing extrinsic information. By taking 
into account the information provided by the precode, 
the optimization problem of an LT code becomes less 
constrained, and for a given precode, the total achievable 
rate of the raptor code becomes closer to the channel 
capacity. 

In this paper, we develop the asymptotic analysis of the 
joint decoder, and propose an optimization method for the 
design of efficient degree distributions. For this purpose, 
we use a fully analytical approach: information content 
(IC) evolution under Gaussian approximation (GA). We 
introduce the extrinsic transfer function of the precode 
into the equations, which leads to a new model that takes 
into account the information provided by the precode. In 
our analysis, the classical separate decoder appears to be a 
sub-case of the joint decoder, by assuming that no extrinsic 
information is passed from the precode to the LT code. 

The remainder of this paper is organized as follows: In
Section II we describe the system that we consider and
give the notations used in the paper. In Section III we
study the asymptotic performance of raptor codes on the
BIAWGNC, state the optimization problem for the design
of output degree distributions, and analyze the main
design parameters. In Section IV we show experimental
results.

II. System description and notations 
A. LT codes and raptor codes 

We call input symbols the set of information symbols to 
be transmitted and output symbols the symbols produced 
by an LT code from the input symbols. An LT code is 
described by its output degree distribution. To generate an 
output symbol, a degree d is sampled from that distribution,
independently of the past samples. The output
symbol is then formed as the binary sum of a subset of
size d of the input symbols, chosen uniformly at random: the
d input symbols and the output symbol satisfy a parity
check equation.

Fig. 1. Description of a raptor code: Tanner graph of an LT code
+ precode. The black squares represent the parity check nodes and
the circles represent variable nodes associated with input symbols or
output symbols.
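As an illustration of the encoding rule described above, the following minimal Python sketch generates one LT output symbol. It is not taken from the paper: the function names, the toy degree distribution and the example input bits are assumptions made only for this example.

```python
import random

def sample_degree(weights, rng):
    """Sample a degree d in 1..len(weights) with probability weights[d-1]."""
    u = rng.random()
    acc = 0.0
    for d, w in enumerate(weights, start=1):
        acc += w
        if u < acc:
            return d
    return len(weights)  # guard against rounding in the cumulative sum

def lt_output_symbol(input_bits, weights, rng):
    """Return (neighbours, value) for one LT output symbol."""
    d = sample_degree(weights, rng)
    # d distinct input positions, chosen uniformly at random
    neighbours = rng.sample(range(len(input_bits)), d)
    value = 0
    for i in neighbours:
        value ^= input_bits[i]  # binary sum of the chosen input symbols
    return neighbours, value

rng = random.Random(0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]      # toy input symbols
omega_node = [0.1, 0.5, 0.2, 0.2]    # toy distribution, not from the paper
neighbours, value = lt_output_symbol(bits, omega_node, rng)
```

By construction, the binary sum of the returned value and of the selected input bits is zero, which is the parity check equation associated with the output symbol.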

Let $\Omega_1, \Omega_2, \ldots, \Omega_{d_c}$ be the distribution weights on
$1, 2, \ldots, d_c$, so that $\Omega_d$ denotes the probability of choosing
the value $d$ under this distribution. We denote the
output degree distribution using its generator polynomial
$\Omega(x) = \sum_{j=1}^{d_c} \Omega_j x^j$, which is associated with the
corresponding edge degree distribution
$\omega(x) = \sum_{i=1}^{d_c} \omega_i x^{i-1} = \Omega'(x)/\Omega'(1)$.
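The relation $\omega(x) = \Omega'(x)/\Omega'(1)$ amounts to $\omega_i = i\,\Omega_i/\Omega'(1)$, and conversely $\Omega'(1) = 1/\sum_i \omega_i/i$. A small Python sketch of this conversion follows; the function names and the toy weights are assumptions made only for this example.

```python
def edge_from_node(node_weights):
    """omega_i = i * Omega_i / Omega'(1), with node_weights[i-1] = Omega_i."""
    mean_degree = sum(i * w for i, w in enumerate(node_weights, start=1))  # Omega'(1)
    return [i * w / mean_degree for i, w in enumerate(node_weights, start=1)]

def mean_degree_from_edge(edge_weights):
    """Omega'(1) = 1 / sum_i(omega_i / i), with edge_weights[i-1] = omega_i."""
    return 1.0 / sum(w / i for i, w in enumerate(edge_weights, start=1))

node = [0.1, 0.5, 0.2, 0.2]   # toy Omega_1..Omega_4, illustrative only
edge = edge_from_node(node)   # omega_1..omega_4
```

The edge weights sum to one, and the average output degree can be recovered from them with `mean_degree_from_edge`.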

Because input symbols are chosen uniformly at random,
their node degree distribution is binomial, and can be
approximated by a Poisson distribution with parameter
$\alpha$ [3]. Thus, the polynomial that describes the
input symbol degree distribution is defined as:

$$I(x) = e^{\alpha(x-1)}$$

Moreover, the associated input edge degree distribution
$\iota(x) = \sum_{i \ge 1} \iota_i x^{i-1} = I'(x)/I'(1)$ also equals $e^{\alpha(x-1)}$. Both
distributions are of mean $\alpha$.
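The equality of the node and edge distributions can be checked in one line, assuming that the edge distribution is obtained from the node distribution in the same way as $\omega(x)$ is obtained from $\Omega(x)$:

```latex
I(x) = e^{\alpha(x-1)}
\;\Longrightarrow\;
I'(x) = \alpha\, e^{\alpha(x-1)}, \qquad I'(1) = \alpha,
\qquad
\iota(x) = \frac{I'(x)}{I'(1)} = e^{\alpha(x-1)} .
```

In particular, $I'(1) = \alpha$ is the average degree of the input symbols.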

Input symbols are not transmitted over the channel. At 
the receiver side, we have noisy observations of the output 
symbols, and belief propagation (BP) decoding is used to 
recover iteratively the input symbols. 

A raptor code is an LT code concatenated with an outer 
code called "precode". The input symbols of the LT code 
are then formed by a codeword of the precode. 

Although fountain codes are rateless, we can define the
a posteriori rate $R_{LT}$ of an LT code as follows:

$$R_{LT} = \frac{\text{Nb input symbols}}{\text{Nb output symbols needed for successful decoding}} = \frac{\Omega'(1)}{\alpha} \qquad (1)$$



As for LDPC codes, a raptor code can be represented
by a Tanner graph. A Tanner graph is a bipartite
representation of a system composed of data nodes and
function nodes. Here, the data nodes represent input or
output symbols and the function nodes represent how their
adjacent data nodes interact through parity checks. The
edges of the graph carry probability messages that come
in or out of the data nodes. The Tanner graph of a raptor
code is given in Fig. 1.



B. Tandem and joint decoding of a raptor code 

The classical "Tandem decoding" (TD) consists of decoding
the LT code first and then using the extrinsic information
about the input symbols as a priori information for
the precode.

For "Joint decoding" (JD), one decoding iteration consists
of alternating $N_{LT}$ decoding iterations on the LT
code and $N_p$ decoding iterations on the precode. Thus,
both code components of the raptor code provide extrinsic
information to each other. In the sequel, we shall only
consider the case where $N_{LT} = N_p = 1$, and where the
precode is an LDPC code. In this particular case, the
raptor code can be described by a single Tanner graph
with two kinds of parity check nodes: check nodes of
the precode, referred to as "static check nodes", and parity
check nodes of the LT code, referred to as "dynamic
check nodes".

Because the precode provides extrinsic information to
the LT code, we need to introduce the extrinsic transfer
function of the precode, denoted by $x \mapsto T(x)$, into the IC
evolution equations.

III. Asymptotic analysis and design of raptor codes for the BIAWGNC

In this section, we derive the asymptotic analysis of a
raptor code under JD. Thus we assume that extrinsic
information is exchanged between the precode and the fountain
at each decoding iteration. The analysis is presented
from the fountain point of view, and we track the
evolution of the IC of the messages that are related
to the fountain part of the Tanner graph. Indeed, our
objective is to optimize the distribution of the fountain
part of the raptor code, namely $\omega(x)$, taking into account
the contribution of the precode through its IC transfer
function.

For our study, we use IC evolution under GA and the
tree-like assumption. This allows us to keep a fully analytical
and one-dimensional approach, without the need for
Monte Carlo simulations as done in [5], thus leading to a
more computationally efficient optimization. IC evolution
is an alternative to mean evolution under GA [6] that
has been shown to be more accurate and robust for the
optimization of LDPC/IRA codes [7].

The messages on the decoding graph are the log density
ratios (LDR) of the probability weights. They are modeled
by a random variable which is assumed to be Gaussian
distributed with mean $m$ and variance $\sigma^2 = 2m$ [6]. Thus,
the density of the messages is symmetric [8]. For a message
sampled from such a symmetric Gaussian distribution, the
IC associated to the message is $x = J(m)$ [7], where $J(\cdot)$
is defined by:

$$J(m) = 1 - \frac{1}{\sqrt{4\pi m}} \int_{\mathbb{R}} \log_2\left(1 + e^{-v}\right) \exp\left(-\frac{(v-m)^2}{4m}\right) dv \qquad (2)$$
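For readers who wish to experiment, a minimal numerical sketch of $J(\cdot)$ and of its inverse is given below. It assumes the standard definition of the IC of a symmetric Gaussian message, $J(m) = 1 - E[\log_2(1 + e^{-v})]$ with $v \sim N(m, 2m)$, evaluated with a plain trapezoidal rule; the function names, integration parameters and the bisection-based inverse are choices made for this example, not the implementation used in the paper.

```python
import math

def J(m, steps=400, span=8.0):
    """IC of a symmetric Gaussian LDR message with mean m and variance 2m."""
    if m <= 0.0:
        return 0.0
    s = math.sqrt(2.0 * m)           # standard deviation of the message
    h = 2.0 * span / steps
    total = 0.0
    for n in range(steps + 1):
        t = -span + n * h            # standard normal variable
        v = m + s * t                # message value
        # numerically stable evaluation of log(1 + exp(-v))
        if v >= 0.0:
            g = math.log1p(math.exp(-v))
        else:
            g = -v + math.log1p(math.exp(v))
        w = 0.5 if n in (0, steps) else 1.0   # trapezoidal weights
        total += w * g * math.exp(-0.5 * t * t)
    expectation = total * h / math.sqrt(2.0 * math.pi) / math.log(2.0)
    return 1.0 - expectation

def J_inv(x, m_max=200.0, iters=60):
    """Inverse of J by bisection; saturates at m_max when x is close to 1."""
    if x <= 0.0:
        return 0.0
    lo, hi = 0.0, m_max
    if J(hi) <= x:
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if J(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sigma = 0.9787                 # noise level quoted in Section IV for C = 0.5
x0 = J(2.0 / sigma ** 2)       # IC of the channel observations
```

With the noise level used in Section IV, `x0` should be close to the channel capacity of 0.5, which gives a simple sanity check of the numerical integration.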
A. Asymptotic analysis of raptor codes

1) Information content evolution: When the precode is
an LDPC code with variable node and check node edge
distributions $\lambda(x)$ and $\rho(x)$, its IC transfer function [9] is given by:

$$T(x) = \sum_{i=1}^{d_v} \lambda_i J\left( i\, J^{-1}\left( 1 - \sum_{j=2}^{d_c} \rho_j J\left( (j-1) J^{-1}(1 - x) \right) \right) \right) \qquad (3)$$



We denote by $x_u^{(l)}$ (resp. $x_v^{(l)}$) the IC associated to messages
on an edge connecting a dynamic check node to an input
symbol (resp. an input symbol to a dynamic check node) at
the $l$-th decoding iteration. We denote by $x_{ext}^{(l)}$ the extrinsic
information passed by the LT code to the precode at the
$l$-th decoding iteration. As the input symbols are of average
degree $\alpha$, we have:

$$x_{ext}^{(l)} = J\left(\alpha J^{-1}(x_u^{(l)})\right)$$

The extrinsic information passed by the precode to the
LT code is then $T(x_{ext}^{(l)})$. When accounting for the transfer
function of the precode, the IC update rules in
the Tanner graph can be written as follows:

$$x_v^{(l)} = \sum_{i \ge 1} \iota_i J\left( (i-1) J^{-1}(x_u^{(l-1)}) + J^{-1}\left(T(x_{ext}^{(l-1)})\right) \right) \qquad (4)$$

$$x_u^{(l)} = 1 - \sum_{j=1}^{d_c} \omega_j J\left( (j-1) J^{-1}(1 - x_v^{(l)}) + f_0 \right) \qquad (5)$$

with:

$$f_0 = J^{-1}\left(1 - J\left(\frac{2}{\sigma^2}\right)\right)$$



Replacing (4) in (5) gives (6), which describes the evolution
through one joint decoding iteration of the IC
of the LDRs at the output of the dynamic check nodes
(fountain part): $x_u^{(l)} = F(x_u^{(l-1)}, \sigma^2)$, with

$$F(x, \sigma^2) = 1 - \sum_{j=1}^{d_c} \omega_j J\left( (j-1) J^{-1}\left( 1 - \sum_{i \ge 1} \iota_i J\left( (i-1) J^{-1}(x) + J^{-1}\left(T\left(J(\alpha J^{-1}(x))\right)\right) \right) \right) + f_0 \right) \qquad (6)$$

Note that for a given
distribution $\iota(x)$, this expression is linear with respect to
the coefficients of $\omega(x)$, which is the distribution that we
intend to optimize.

We point out that (6) is general since it reduces to
the classical TD case by setting the extrinsic transfer
function to $x \mapsto T(x) = 0, \forall x \in [0;1]$, thus assuming
that no information is propagated from the precode to the
fountain.
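The recursion can be explored numerically with the sketch below. It is only an illustration under stated assumptions: $J$ is evaluated by a trapezoidal rule and inverted by bisection, the Poisson edge distribution is truncated, and the update rules are taken in the form $x_v = \sum_i \iota_i J((i-1)J^{-1}(x_u) + J^{-1}(T(x_{ext})))$ and $x_u = 1 - \sum_j \omega_j J((j-1)J^{-1}(1-x_v) + f_0)$ with $f_0 = J^{-1}(1 - J(2/\sigma^2))$. The distribution `omega`, the value of `alpha` and the identity transfer function `T` used for JD are hypothetical toy choices, not results of the paper.

```python
import math

def J(m, steps=200, span=8.0):
    """IC of a Gaussian message with mean m and variance 2m (trapezoidal rule)."""
    if m <= 0.0:
        return 0.0
    s, h, total = math.sqrt(2.0 * m), 2.0 * span / steps, 0.0
    for n in range(steps + 1):
        t = -span + n * h
        v = m + s * t
        g = math.log1p(math.exp(-v)) if v >= 0.0 else -v + math.log1p(math.exp(v))
        total += (0.5 if n in (0, steps) else 1.0) * g * math.exp(-0.5 * t * t)
    return 1.0 - total * h / math.sqrt(2.0 * math.pi) / math.log(2.0)

def J_inv(x, m_max=200.0, iters=50):
    """Bisection inverse of J, saturating at m_max when x is close to 1."""
    if x <= 0.0:
        return 0.0
    lo, hi = 0.0, m_max
    if J(hi) <= x:
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if J(mid) < x else (lo, mid)
    return 0.5 * (lo + hi)

def make_F(omega, alpha, sigma, T=lambda x: 0.0):
    """Return x -> F(x, sigma^2) for edge weights omega[j-1] = omega_j."""
    f0 = J_inv(1.0 - J(2.0 / sigma ** 2))
    i_max = int(alpha + 10.0 * math.sqrt(alpha) + 10)   # Poisson truncation
    iota = [math.exp(-alpha)]                           # iota_1
    for i in range(2, i_max + 1):
        iota.append(iota[-1] * alpha / (i - 1))         # iota_i

    def F(x):
        m_u = J_inv(x)
        m_pre = J_inv(T(J(alpha * m_u)))                # precode contribution
        x_v = sum(w * J((i - 1) * m_u + m_pre)
                  for i, w in enumerate(iota, start=1))
        m_v = J_inv(1.0 - x_v)
        return 1.0 - sum(w * J((j - 1) * m_v + f0)
                         for j, w in enumerate(omega, start=1))
    return F

sigma = 0.9787
x0 = J(2.0 / sigma ** 2)
omega = [0.05, 0.5, 0.45]                        # toy edge distribution
F_td = make_F(omega, 5.0, sigma)                 # TD: T(x) = 0
F_jd = make_F(omega, 5.0, sigma, T=lambda x: x)  # JD with a hypothetical T
```

Setting `T` to the null function gives the TD recursion; any transfer function with $T(0) = 0$ and $T(1) = 1$ can be substituted to study the JD case.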

2) Fixed point characterization: In an IC evolution analysis,
convergence is guaranteed by $F(x, \sigma^2) > x$. Convergence
continues toward a fixed point of $x \mapsto F(x, \sigma^2)$.
Unfortunately, there is no trivial solution for the fixed
point of (6). However, using a functional analysis, an
upper bound on the fixed point can be given. Replacing
$x_u$ by 1 and using the fact that $T(1) = 1$ in (6), we obtain:

$$\lim_{x \to 1} F(x, \sigma^2) = J\left(\frac{2}{\sigma^2}\right) = x_0 \qquad (7)$$

which means that, because $x \mapsto F(x, \sigma^2)$ is an increasing
function, the fixed point is necessarily less than or equal to
$x_0$, which is the capacity of a BIAWGNC with parameter
$\sigma^2$. Thus, the IC is upper bounded through the decoding
iterations by $x_0$. This gives some insight into the asymptotic
behavior at the decoding convergence point: the BP
decoding of the LT part of a raptor code is limited on a
BIAWGNC by the capacity of the channel.
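The limit can be followed step by step. The sketch below assumes the update rules in the form $x_v = \sum_i \iota_i J((i-1)J^{-1}(x_u) + J^{-1}(T(x_{ext})))$ and $x_u = 1 - \sum_j \omega_j J((j-1)J^{-1}(1 - x_v) + f_0)$, with $f_0 = J^{-1}(1 - J(2/\sigma^2))$:

```latex
\begin{align*}
x_u \to 1,\; T(1) = 1
  &\;\Longrightarrow\; J^{-1}\big(T(x_{ext})\big) \to \infty
   \;\Longrightarrow\; x_v \to \sum_{i} \iota_i = 1, \\
x_v \to 1
  &\;\Longrightarrow\; J^{-1}(1 - x_v) \to 0
   \;\Longrightarrow\; F(x, \sigma^2) \to 1 - \sum_{j} \omega_j J(f_0) = 1 - J(f_0), \\
1 - J(f_0)
  &= 1 - \Big(1 - J\big(\tfrac{2}{\sigma^2}\big)\Big) = J\big(\tfrac{2}{\sigma^2}\big) = x_0 .
\end{align*}
```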

This result is not really surprising and can be interpreted
as follows: the output nodes of degree one have a
constant contribution on each check node. As the iterative
decoding process goes on, the IC of the messages at the
output of the dynamic check nodes is limited by the
channel observations.

3) Starting condition: We now derive a condition for the
beginning of the decoding process: at the first iteration,
$x_u^{(0)} = 0$. Therefore, according to (4), $x_v^{(1)} = 0$. Reporting
this in (6) gives:

$$x_u^{(1)} = F(0, \sigma^2) = \omega_1 J\left(\frac{2}{\sigma^2}\right) \qquad (8)$$

The decoding process can begin iff $x_u^{(1)} > \epsilon$, for some
arbitrary $\epsilon > 0$, which gives:

$$\omega_1 > \frac{\epsilon}{J(2/\sigma^2)}$$

Therefore, one must have $\omega_1 > 0$ for the decoding
process to begin, and $\epsilon$ appears to be a design parameter
that will constrain the optimization problem, ensuring
that $\omega_1 \neq 0$. In practice, the value of $\epsilon$ can be chosen
arbitrarily small. Indeed, it has been proved [5] that for
a sequence of capacity achieving distributions $\omega^{(n)}(x)$,
$\omega_1^{(n)} \to 0$.

Remark: as an illustration we point out that, for the
"Ideal Soliton Distribution" introduced by Luby [2], $\Omega_1 =
1/k$, which is the smallest proportion possible with k input
symbols.
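As a small illustration of this remark, the Ideal Soliton Distribution can be tabulated as follows. The sketch uses its usual definition ($\Omega_1 = 1/k$ and $\Omega_d = 1/(d(d-1))$ for $d = 2, \ldots, k$); the function name and the value of `k` are choices made for this example.

```python
def ideal_soliton(k):
    """Omega_1 = 1/k and Omega_d = 1/(d(d-1)) for d = 2..k."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]

k = 1000
omega_node = ideal_soliton(k)
# average output degree Omega'(1); equals the harmonic number H_k
mean_degree = sum(d * w for d, w in enumerate(omega_node, start=1))
```

The average degree equals the harmonic number $H_k$, which illustrates the logarithmic growth of the average output degree of LT codes mentioned in the introduction.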

4) Lower bound on the edge proportion of degree 2 output
symbols: For an output degree distribution that is to be
capacity achieving, we have:

$$\omega_2 > \frac{1}{(\alpha - 1)\, e^{-f_0/4}} \qquad (9)$$

We only give a sketch of the proof. Let $\mu$ be defined by
$\mu = J^{-1}(1 - x)$. By differentiating $x \mapsto F(x, \sigma^2)$ defined in
(6), and using the approximation of the derivative of $J(\mu)$
for large $\mu$ given in [7], we get:

$$\lim_{x \to 0} F'(x, \sigma^2) = \omega_2 (\alpha - 1)\, e^{-f_0/4}$$

Moreover, for a capacity achieving degree distribution,
$\omega_1 = 0$ [5], which means that $F(0, \sigma^2) = 0$. Then,
the convergence condition $F(x, \sigma^2) > x$ implies that
$\lim_{x \to 0} F'(x, \sigma^2) > 1$, which gives the result.

Remark: the IC evolution method leads to a slightly 
different result than the one obtained with mean evolution 
[5]. The same phenomenon has been observed for the 
derivation of the stability condition, for the optimization 
of LDPC codes. 






B. Design of output degree distributions

In this section, we state explicitly the optimization problem for
the design of output degree distributions, and give some
complementary results that we used for the choice of the
design parameters.

1) Optimization problem statement: The optimization
of an output distribution consists of maximizing the rate
of the corresponding LT code, i.e. maximizing $\Omega'(1) =
\sum_j j\,\Omega_j$, which is equivalent to minimizing $\sum_i \omega_i / i$.
Moreover, according to the previous section, several constraints
must be satisfied. As $\omega(x)$ is a probability distribution, its
coefficients must sum up to 1. We call this the proportion
constraint [C1]. Moreover, convergence requires that
$F(x, \sigma^2) > x$. However, this inequality cannot hold for
each and every value of $x$: the analysis in Section III-A.2
shows that the fixed point of $F(x, \sigma^2)$ is smaller than
$x_0 = J(2/\sigma^2)$. Therefore, we must fix a margin $\delta > 0$
away from $x_0$. By discretizing $[0; x_0 - \delta]$ and requiring the
inequality to hold on the discretization points, we obtain
a set of inequalities that need to be satisfied: they define
the convergence constraint [C2]. The starting condition
(8) must also be satisfied and defines the constraint [C3].
Moreover, the edge proportion of degree 2 output symbols must
fulfill (9), defining the stability constraint [C4]. Finally,
$x \mapsto T(x)$ is defined according to (3) for an LDPC code,
or could be estimated with Monte Carlo simulations if
another component code is used as a precode.

For a given value of $\alpha$, the cost function and the
constraints are linear with respect to the unknown coefficients
$\omega_i$. Therefore, the optimization of an output degree
distribution can be written as a linear optimization problem
that can be solved with linear programming. For a given
$\alpha$, the optimization problem can be stated as follows:

$$\omega_{opt}(x) = \arg\min_{\omega} \sum_j \frac{\omega_j}{j} \qquad (10)$$

subject to the constraints:
[C1] $\sum_i \omega_i = 1$
[C2] $F(x, \sigma^2) > x \quad \forall x \in [0; x_0 - \delta]$ for some $\delta > 0$
[C3] $F(0, \sigma^2) > \epsilon$ for some $\epsilon > 0$
[C4] $F'(0, \sigma^2) > 1$
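To make the linearity concrete, the sketch below assembles the rows of the discretized convergence constraint [C2]: since $F(x, \sigma^2) = 1 - \sum_j \omega_j g_j(x)$ with $g_j(x) = J((j-1)J^{-1}(1 - x_v) + f_0)$, the condition $F(x, \sigma^2) > x$ reads $\sum_j \omega_j g_j(x) < 1 - x$. This is a hedged illustration only: it assumes the update rules in the form used in Section III-A with $f_0 = J^{-1}(1 - J(2/\sigma^2))$, evaluates $J$ numerically, and uses hypothetical helper names (`lp_rows`, `cost`, `satisfies`) and toy parameters. The rows would then be passed to any linear programming solver, which is not included here.

```python
import math

def J(m, steps=200, span=8.0):
    """IC of a Gaussian message with mean m and variance 2m (trapezoidal rule)."""
    if m <= 0.0:
        return 0.0
    s, h, total = math.sqrt(2.0 * m), 2.0 * span / steps, 0.0
    for n in range(steps + 1):
        t = -span + n * h
        v = m + s * t
        g = math.log1p(math.exp(-v)) if v >= 0.0 else -v + math.log1p(math.exp(v))
        total += (0.5 if n in (0, steps) else 1.0) * g * math.exp(-0.5 * t * t)
    return 1.0 - total * h / math.sqrt(2.0 * math.pi) / math.log(2.0)

def J_inv(x, m_max=200.0, iters=50):
    """Bisection inverse of J, saturating at m_max when x is close to 1."""
    if x <= 0.0:
        return 0.0
    lo, hi = 0.0, m_max
    if J(hi) <= x:
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if J(mid) < x else (lo, mid)
    return 0.5 * (lo + hi)

def lp_rows(d_c, alpha, sigma, delta, n_points=20, T=lambda x: 0.0):
    """Rows (g, rhs) such that [C2] reads sum_j omega_j * g[j-1] < rhs."""
    x0 = J(2.0 / sigma ** 2)
    f0 = J_inv(1.0 - x0)
    i_max = int(alpha + 10.0 * math.sqrt(alpha) + 10)   # Poisson truncation
    iota = [math.exp(-alpha)]
    for i in range(2, i_max + 1):
        iota.append(iota[-1] * alpha / (i - 1))
    rows = []
    for n in range(n_points + 1):
        x = (x0 - delta) * n / n_points                 # grid on [0, x0 - delta]
        m_u = J_inv(x)
        m_pre = J_inv(T(J(alpha * m_u)))
        x_v = sum(w * J((i - 1) * m_u + m_pre) for i, w in enumerate(iota, start=1))
        m_v = J_inv(1.0 - x_v)
        g = [J((j - 1) * m_v + f0) for j in range(1, d_c + 1)]
        rows.append((g, 1.0 - x))
    return rows

def cost(omega):
    """Objective (10): sum_j omega_j / j."""
    return sum(w / j for j, w in enumerate(omega, start=1))

def satisfies(omega, rows):
    """Check the discretized convergence constraint for a candidate omega."""
    return all(sum(w * gj for w, gj in zip(omega, g)) < rhs for g, rhs in rows)

rows = lp_rows(d_c=4, alpha=5.0, sigma=0.9787, delta=0.04, n_points=5)
```

For instance, a distribution made only of degree 1 edges satisfies every row but has the largest cost, whereas a distribution with $\omega_1 = 0$ already fails at $x = 0$, in line with the starting condition.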

2) Lower bound on $\alpha$: The average degree of input
symbols $\alpha$ is the main design parameter. At the output
of an LT code, the IC of messages sent from the LT code
to the precode is given by:

$$x_{ext} = J\left(\alpha J^{-1}(x_u)\right) \qquad (11)$$

Moreover, let $x_p$ be the IC threshold above which the
decoding of the precode is successful. Then successful
decoding of the raptor code is obtained if $x_{ext} > x_p$. Using
equation (11), and recalling that $x_u < x_0$, we get a lower
bound on $\alpha$:

$$\alpha > \frac{J^{-1}(x_p)}{J^{-1}(x_0)} = \alpha_{min} \qquad (12)$$

This bound can be used to limit the search space on
$\alpha$. Indeed, for increasing values of $\alpha$, we optimize output
degree distributions as explained in the previous section.
It appears that there is a value of $\alpha$ that maximizes the
corresponding rate of the LT code.
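The lower bound on $\alpha$ follows from the monotonicity of $J$. A short derivation, assuming the relation $x_{ext} = J(\alpha J^{-1}(x_u))$ between the IC at the output of the dynamic check nodes and the extrinsic information sent to the precode, reads:

```latex
\begin{align*}
x_{ext} > x_p
  &\iff J\big(\alpha J^{-1}(x_u)\big) > x_p
   \iff \alpha\, J^{-1}(x_u) > J^{-1}(x_p), \\
x_u < x_0
  &\;\Longrightarrow\; J^{-1}(x_u) < J^{-1}(x_0)
   \;\Longrightarrow\; \alpha > \frac{J^{-1}(x_p)}{J^{-1}(x_u)} > \frac{J^{-1}(x_p)}{J^{-1}(x_0)} = \alpha_{min} .
\end{align*}
```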

3) Parameter $\delta$: The convergence of the LT code should
be such that at some point of the decoding process, $x_{ext}$
becomes larger than the precode's threshold $x_p$. For a
given value of $\alpha > \alpha_{min}$, $\delta$ is such that:

$$J\left(\alpha J^{-1}(x_0 - \delta)\right) > x_p \qquad (13)$$

C. Considerations on the choice of a precode 

In this section, we discuss some important points concerning
the choice of the precode, which give another
justification of why JD should be preferred over TD in
the perspective of designing efficient raptor codes. Let $R_t$
be the rate of the raptor code which is the concatenation
of an LT code of rate $R_{LT}$ and a precode of rate $R_p$. We
have:

$$R_t = R_p R_{LT} = R_p \frac{\Omega'(1)}{\alpha} \qquad (14)$$

For a channel with capacity C, we have $R_{LT} < C$ for
LT codes optimized for the TD scheme. Thus, $R_p$ appears
to be a burden in terms of the total rate of the raptor
code. Fortunately, the optimization problem becomes less
constrained in a JD scheme, because the precode provides
extrinsic information to the LT code, and the optimization
for JD leads to $R_{LT} > C$, allowing the use of lower rate
precodes than in the TD scheme.

The use of lower rate precodes can be motivated by 
the fact that the design of very high rate LDPC codes 
is a difficult problem. Even though the optimization of 
irregularity profiles can give codes with good thresholds, 
the actual design of such codes remains difficult, because 
their underlying graph is highly connected. The higher the 
rate, the more difficult it is to design a graph with "few 
enough" short cycles. 

In the context of JD, our optimization procedure naturally
addresses the problem of the overall rate and its
allocation between the fountain code and the
precode.

IV. Experimental results 

We define the overhead of a fountain code as $\epsilon =
(C/R_t) - 1$. Thus, an overhead of 0 means that capacity
is achieved. An overhead of 0.1 means that the rate of
the raptor code is 10% away from the capacity. In our
simulations, the performance of a raptor code is evaluated
by Bit Error Rate (BER) versus overhead.
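As a small numerical illustration, the overhead can be computed as follows. The sketch assumes the definition $\epsilon = C/R - 1$, which is consistent with an overhead of 0 at capacity; the function names and the LT rate used in the example are hypothetical, while the design rate $1 - 3/60$ corresponds to a regular (3,60) LDPC code.

```python
def overhead(capacity, rate):
    """Overhead epsilon = C / R - 1 of a code of rate R on a channel of capacity C."""
    return capacity / rate - 1.0

def raptor_rate(precode_rate, lt_rate):
    """Overall rate of the concatenation: R_t = R_p * R_LT."""
    return precode_rate * lt_rate

r_p = 1.0 - 3.0 / 60.0     # design rate of a regular (3,60) LDPC code
r_lt = 0.5                 # hypothetical LT rate, for illustration only
eps = overhead(0.5, raptor_rate(r_p, r_lt))
```

With these toy values, an LT code operating exactly at $R_{LT} = C$ still leaves an overhead of about 5% because of the precode rate, which illustrates why the precode rate matters in the overall rate.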

A. LT codes 

First, we use our model to design LT codes. We recall
that this is possible by defining the extrinsic transfer
function as the null function.

We design an output degree distribution $\Omega_A(x)$, with
parameters $\delta = 0.04$ and $\alpha = 21$. The optimization was
made for a BIAWGNC of capacity C = 0.5, in order to
compare our distribution to the one proposed in [5, p
2044], referred to as $\Omega_E(x)$. As we only test the LT code,
we did not use any precode. The simulations were set to
k = 65000 input symbols, and 300 decoding iterations, on
a channel of capacity C = 0.5 ($\sigma = 0.9787$).

In Fig. 2 we report the BER versus overhead for LT
codes defined by $\Omega_A(x)$ and $\Omega_E(x)$. The distribution obtained
with our method performs as well as the one proposed in [5],
while the design is computationally more efficient, since it
does not require Monte Carlo simulations.

Fig. 2. BER versus overhead for LT codes defined by the distribution
$\Omega_A(x)$ optimized for a BIAWGNC of capacity C = 0.5. We compare
our distribution to the one proposed in [5, p 2044], denoted by $\Omega_E(x)$,
with k = 65000 input bits.

B. TD versus JD 

We now compare JD and TD schemes. We used a regular
(3,60) LDPC precode of length N = 65000, generated
randomly. We compare the distribution $\Omega_E(x)$ proposed
in [5, p 2044], in both TD and JD decoding schemes, to
a distribution $\Omega_B(x)$ that we optimized for JD with our
method. For the distribution $\Omega_E(x)$ there is very little
difference between TD and JD decoding schemes. This can
be explained by the fact that the distribution has not been
optimized to take into account the information provided
by the precode. For our distribution $\Omega_B(x)$, performance
is improved. The effect of the precode is to help the
convergence of the LT code, which can be interpreted
as follows: the BER decreases with a slope, whereas for
$\Omega_E(x)$, there is clearly a threshold behavior.

Fig. 3. BER versus overhead for a raptor code defined with a regular
(3,60) LDPC precode. We compare $\Omega_B(x)$, a distribution that we
optimized for joint decoding, to $\Omega_E(x)$ proposed in [5] under TD
(blue squares) and under JD (black stars).



V. Conclusion 

We presented an asymptotic analysis of raptor codes
with IC evolution under GA, stated the optimization
problem for the design of output degree distributions well
adapted to joint decoding, and analyzed the main design
parameters. Our model also allows the design of efficient LT
codes. Experimental results show that JD is more efficient
than the classical TD decoding scheme.